In Pursuit of Beauty 


- Statistics 2: Real Data Shared - 


A Historic Note 


This statue of the goddess Aphrodite was carved by
Alexandros of Antioch, son of Menides, somewhere
between 130 B.C.E. and 90 B.C.E. A peasant named
Yorgos Kentrotas found her in 1820 inside a buried
niche within the ruins of the ancient city of Milos,
on the Aegean island of Melos.


Originally she would have stood in her niche in the
wall of a gymnasium, where she would have been
colourfully painted and wearing jewellery. She was
gazing at an apple that she held aloft in her left
hand. Her right arm clutched her garment.


The statue was carved from seven blocks of marble 
and although the arms were recovered, an “expert” 
at that time didn’t think they quite matched. 


The statue would have been taken to a kiln to be
reduced to lime; however, she was purchased by a
French naval officer, Olivier Voutier, who took her
home to Paris. She is now displayed in the Musée
du Louvre with the Romanised name Venus de Milo.


The Australian Bureau of Statistics 

Among the various services the Australian Bureau of Statistics provides is the 
collection and processing of millions of census forms. The Bureau also provides a 
clearinghouse for business surveys. The ABS wishes to see its statistics widely
used because this contributes to its mission of “assisting and encouraging informed
decision making, research and discussion”. It therefore takes quite seriously the
task of educating, and assisting those who educate, about statistics. We are
grateful for the Bureau's cooperation in preparing this lesson; the ABS data
mentioned here is used with permission from the Australian Bureau of Statistics.


The entire contents of the ABS website are now accessible free of charge at
<www.abs.gov.au>.

The ABS website

Open the ABS website at www.abs.gov.au. 

It is very easy to be side-tracked by the many interesting topics displayed on the 
front page, so please follow these directions for the moment and explore more 


widely at a later time. 


About CensusAtSchool 

You can learn a lot more about CensusAtSchool if you click on the menu items to 
the left of the screen. Very briefly, teachers can ask students from years 5 to 12 
to go online and answer a series of questions about themselves. The ABS keeps a 
database of the responses and anyone can go online and download a random 
sample of those responses. Teachers can also download very useful materials 
such as Question Sheets that help their students work through and understand 


the data they download. 


The main links bar is just below the ABS logo and header. It starts “Home,
Statistics, Services We Provide, ...”. Toward the end of the bar is the word
“Education”. Please click on that link.


The Education Services page has a link “For Teachers” and a link “For Students”. 
Follow the link for students where you will see even more distractions to visit 


later. Follow the link to Real Data and the next link to CensusAtSchool Data. 


When you follow the link to Random Sampler, please read the page carefully and
then click the Accept button.


Getting Data from the ABS website to a Spreadsheet 

While some teachers will choose to do this step for you, by describing it here, we 
hope you will search the net more widely using your machine at home. You will 
need the software that came with your ClassPad installed on a computer running 
a Windows operating system. You will need a spreadsheet program such as 
Microsoft Excel or Lotus 123. You will also need access to the internet via browser 


software such as Microsoft Internet Explorer. 


Follow the directions given above until you find the Random Sampler page. 


When you are asked to enter a sample size, choose the largest sample available:
larger samples give more reliable statistics. In this case, choose 200.

[Screenshot: the CensusAtSchool Random Sampler page in the Student Area. The page lists three steps:
1. Enter your sample size. Your data sample can contain between 10 and 200 data records.
2. Select your desired sample characteristics using the drop down lists (gender, postcode range, year levels, dominant hand and size of school).
3. Click the ‘Get Data Sample’ button.
In the example, the postcode range is 0 to 9999 and only Year 10 is ticked.]


You will notice that you can put several different limitations on the data selected. 


You can choose a particular gender or year group and you can use the postcode 


range to limit your study to a particular area. I chose to limit the data to just 


those responses from Year 10 students. 


When you click on the “Get Data Sample” button you are taken to a screen that 


summarises the fields you asked for and offers you two files to download. You will 


need both of them. The cas_variables_sheet.xls is a regular Excel file that gives 


you a detailed description of each field in the spreadsheet and translates the 


codes that have been used to distinguish the various responses. It is best to print 


this file so that you can refer to it easily. 


The cas_sample_xxxx.csv file is a special type of spreadsheet file that travels 


fairly safely on poorer quality internet connections. Each field is separated using a 


comma. Hence the name of this format: Comma Separated Values. 

If I load the csv file into a word processor, the first part looks like this.
Notice that each item is delimited by a comma.


[Screenshot: the file cas_sample_55322400.csv opened as plain text. The first line lists the question codes (Q01, Q02, ...), the second line lists the field names (Qtime, Sex, BrthMnth, BrthYear, BrthPlce, BedRms, PeoplHme, ...), and each following line holds one student's responses, for example a record beginning 24, Female, September, 1991, Queensland. Every item is separated from the next by a comma.]


When you click on the link, your browser software will probably know that you 


need a spreadsheet and will open Excel for you. If you have a recent version of 


Excel, it will know what to do with the csv file and will open it straight away as a 


regular spreadsheet file. However, your version of Excel may start up asking how


to interpret the csv file. The only thing that you need to remember is that when 


your spreadsheet asks you to identify delimiters, you tick the box next to comma. 


Other than that just keep on clicking “Next” until you get to click “Finish”. 
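
For readers who would like to see what “delimited by a comma” means in practice, here is a minimal Python sketch that splits csv text the same way the spreadsheet import does. It is not part of the ABS materials. The three records are copied from the opening rows of the sample shown in this lesson, but the first question code is left blank because it is not legible in the original, and the function name read_sample is made up for illustration.

```python
import csv
import io

# A tiny excerpt laid out like cas_sample_xxxx.csv (assumed layout):
# line 1 = question codes, line 2 = field names, then one line per student.
SAMPLE = """,Q01,Q02,Q02
Qtime,Sex,BrthMnth,BrthYear
24,Female,September,1991
16,Female,February,1991
15,Male,September,1991
"""

def read_sample(text):
    """Return (field_names, records) from csv text with two header lines."""
    rows = list(csv.reader(io.StringIO(text)))
    field_names = rows[1]  # the second line holds the field names
    # Pair every response with its field name, one dictionary per student.
    records = [dict(zip(field_names, row)) for row in rows[2:]]
    return field_names, records

field_names, records = read_sample(SAMPLE)
print(field_names)
print(records[0]["Sex"], records[0]["BrthYear"])
```

The csv module does the comma splitting, which is the same job as ticking the “comma” delimiter box in the Excel import dialogue.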


Preparing a spreadsheet 
At this stage you should have a spreadsheet that starts a bit like this. 


You may care to compare the two screen dumps on this page and see how Excel
reads the .csv file to form the .xls file shown above. If you are wondering about 


the first column, it is the time taken for people to complete the questionnaire. 

[Screenshot: the first ten rows of the spreadsheet, columns A to G. Row 1 holds the question codes (Q01, Q02, ...), row 2 holds the field names Qtime, Sex, BrthMnth, BrthYear, BrthPlce, BedRms and PeoplHme, and rows 3 onward each hold one student's responses.]


Save your Spreadsheet as Year10 ABS Data 

If you tried to transfer the whole spreadsheet to your ClassPad it wouldn’t fit. 

You need to make up smaller spreadsheets with just the fields you need. 

For example: 

Go to columns Z, AA and AB and select the three columns by dragging your 
mouse across the column headings. 

Copy the columns by pressing the “c” key while you hold down the control key. 
From the File Menu select New Workbook. 

When a blank spreadsheet appears click on cell A1 and paste the columns into the
new sheet by pressing the “v” key while you hold down the control key. 


Click on the column header for column B. From the Edit Menu choose Delete. 
Cell A2 should contain the variable name Height. 

Cell B2 should contain the variable name BllyButn. 

Save your New Workbook as Height. 
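
The same trimming can be scripted instead of done by hand. The following is only a sketch, assuming the csv layout described earlier (question codes on line 1, field names on line 2). The function name select_fields and the question codes Qa, Qb, Qc are placeholders, and only the first data row (165, 24, 99) comes from the sample in this lesson; the other two rows are invented.

```python
import csv
import io

# Placeholder excerpt: the codes and the last two data rows are made up.
SAMPLE = """Qa,Qb,Qc
Height,RighFoot,BllyButn
165,24,99
172,26,105
158,23,96
"""

def select_fields(text, wanted):
    """Return csv text holding only the wanted fields, found by field name."""
    rows = list(csv.reader(io.StringIO(text)))
    names = [name.strip() for name in rows[1]]
    positions = [names.index(field) for field in wanted]
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    # Skip the question-code line; keep the field names and every data row.
    for row in rows[1:]:
        writer.writerow([row[p].strip() for p in positions])
    return out.getvalue()

height_csv = select_fields(SAMPLE, ["Height", "BllyButn"])
print(height_csv)
```

Selecting by field name rather than by column letter means the script still works if the ABS adds or reorders questions.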


Preparing a Second Spreadsheet 
Open your file Year10 ABS Data.
Click on column heading B. Copy the column using Control-C. 


From the File Menu select New Workbook. 


Click on cell A1 and paste into the New Workbook by pressing Control-V.

Cell A2 should contain the variable name Sex. 

Copy column DC from Year10 ABS Data into column B of the New Workbook. 
Cell B2 should contain the variable name HveDyeHr. 


Save the New Workbook as Hair. 


Transferring your Spreadsheets to your ClassPad Emulator 

Check that the file SportQ is still available on your ClassPad from the time you
took the lesson called Any Contingency. Switch off your ClassPad.

Close down your computer. Connect your ClassPad to your computer. 

Boot your computer and then turn on your ClassPad. 

Open your ClassPad Manager software and you should be able to see four 
columns on the screen, two for the ClassPad and two for the computer. 

Open the ClassPad icon and then open the “OtherData” folder. 

Double click on the folder called “Main”. 


The contents of “Main” should now appear in the second ClassPad column. 
Open the Computer icon. This will show your ClassPad emulator.

Open the emulator and then open the “Other Data” and then the “Main” folder.
Drag the icon for SportQ into the second emulator column.

Close the ClassPad Manager. 

Open your ClassPad Emulator and choose Spreadsheet. 

Open the file SportQ. 

Open the Excel spreadsheet Hair. 

In the spreadsheet use your mouse to select the cells from A3 to B202.

Now click very close to the edge of the highlighted area and the cursor
should change to the “Click and Drag” cursor.
Simply click and drag the whole highlighted area across to cell A1 of
your ClassPad emulator.


The screens should now look like the example below. 


[Screenshot: the Excel spreadsheet Hair alongside the ClassPad emulator spreadsheet after the drag.]


Notice that you have over-written the data from the Belfast spreadsheet with 


data from the ABS spreadsheet. 


Before you do anything else, Save your Emulator file as Hair. 


On your emulator you now have two files called SportQ and Hair. 


Transferring the other spreadsheet file 
From the ClassPad Emulator File Menu choose New and click OK. 
Open the Excel file Height and Copy the first two columns as before. 


Save your Emulator file as Height. 


Copyright 2007, Hartley Hyde


Transferring your Spreadsheets to your ClassPad 

Open your ClassPad Manager software. 

Open the ClassPad icon and then open the “OtherData” folder. 

Double click on the folder called “Main”. 

The contents of “Main” should now appear in the second ClassPad column. 
Open the Computer icon. This will show your ClassPad Emulator. 

Open the Emulator and then open the “Other Data” and then the “Main” folder. 
This folder should now contain the files SportQ, Hair and Height. 

Drag the icon for Hair into the second ClassPad column. 

Drag the icon for Height into the same ClassPad column. 

Close the ClassPad Manager. 

Close down your computer and switch off your ClassPad. 


Disconnect your ClassPad. 


Processing the Hair File 

The first column of your spreadsheet gives each student's gender and
the second column lists the responses “Yes” or “No” to Question 37, which asks
respondents if they have dyed hair. There may be some correlation between
gender and having dyed hair, but the expected matrix of the Contingency Table
will assume there is no correlation.


We are looking to see how much the actual data varies from the expected. 
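
If you would like to check the expected matrix by hand, each expected count is the row total multiplied by the column total, divided by the grand total. The Python sketch below illustrates that rule. The counts are invented for the example and are not real ABS data, and the function name expected_counts is just a label chosen here.

```python
def expected_counts(actual):
    """Expected counts for a table of actual counts, assuming no association.

    Each cell is (row total * column total) / grand total.
    """
    row_totals = [sum(row) for row in actual]
    col_totals = [sum(col) for col in zip(*actual)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]

# Invented counts: rows are Female, Male; columns are Yes, No.
actual = [[30, 70],
          [10, 90]]

expected = expected_counts(actual)
for label, row in zip(["Female", "Male"], expected):
    print(label, row)
```

With these made-up numbers both rows of the expected table come out the same, because the two groups are the same size and the expected table assumes gender makes no difference.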


Switch on your ClassPad and open the spreadsheet
file Hair. Your file should look like this:

[Screenshot: the ClassPad spreadsheet, with the contingency table still carrying the old labels such as “Actual” and “Adelaide”.]


You may think it’s a disaster but remember that your 
contingency table is still looking for the words 


“Adelaide” and “Belfast” in column A and it is 


looking for the letters “Y” and “N” in column B. 


Go to cell D2 and type the word “Female”. 


Go to cell D3 and type the word “Male”. 


Go to cell E1 and type the word “Yes”. 
Go to cell F1 and type the word “No”.


Your contingency table should now look a bit like this. 

We expect that the data that you download will be rather different from the data I
downloaded, because it is a random selection. Copy your tables into these spaces.


Actual Yes No 


Female 


Male 


Expected Yes No 


Female 


Male 


Comment on how well your data matches the expectation table. 


Comment on how well your data matches your expectation of Year 10s you know. 


How could you tell if some of the students were just “Mucking around”?


Save your spreadsheet file “Hair”. 


Checkpoint

Processing the Height Data
Open the spreadsheet file “Height”. 


You have two columns of numbers. 


If you read the field codes in the ABS spreadsheet
cas_variables_sheet.xls you will find that the
numbers in column A are the students' heights in cm.
The numbers in column B are the heights of the
corresponding “belly buttons”. Presumably the ABS
discovered that younger students understand the
term “belly button” better than “navel”.


While we can understand a serious government agency taking an interest in
people's height or foot-length, one does start to wonder about measuring the
height of belly buttons. However, there is a serious purpose behind all this.


In 1854 the German scientist Zeising published the book Neue Lehre von den
Proportionen des menschlichen Körpers, which applied the theory of
proportion to human anatomy. He proposed that the ratio of a person's height
to the height of their navel is the Golden Ratio, or Divine
Proportion (φ ≈ 1.62).


On the front page of this lesson is a picture of the Venus de Milo. When I printed 
the picture I measured the height of her image at about 26 cm and the height of 
her navel at about 16 cm. The ratio of 26 to 16 is about 1.625. If you measure 
some other great works of art you will get similar answers. Is this just a case of 
artists using the Golden Ratio as a “rule of thumb” or did their models actually 


have those measurements? How, for that matter, do Year 10 students shape up? 
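
To see how close the statue's measurements come to the Golden Ratio, the arithmetic can be checked in a few lines of Python. This is only a sketch: it uses the standard formula φ = (1 + √5) / 2 and the two measurements quoted above (26 cm and 16 cm); the variable names are chosen for illustration.

```python
import math

# The Golden Ratio from its standard closed form.
phi = (1 + math.sqrt(5)) / 2

# Measurements of the printed picture quoted in the lesson, in cm.
statue_height = 26
navel_height = 16
statue_ratio = statue_height / navel_height

# How far the statue's ratio is from phi, as a percentage of phi.
percent_difference = abs(statue_ratio - phi) / phi * 100

print(round(phi, 3))   # 1.618
print(statue_ratio)    # 1.625
print(round(percent_difference, 2))
```

Remember that measurements taken from a printed picture with a ruler are only accurate to the nearest few millimetres, so a difference this small is well within measuring error.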


There is so much hype around about the Golden Ratio that it is worth reading an
article at http://www.maa.org/devlin/devlin_06_04.html This article
explains why some of the more far-fetched proposals are actually true and
suggests why some suggestions, such as this, gain credibility.

ClassPad Time
This should be quite easy. 


Use the [E+] row insertion tool to insert a blank row at row 1. 


This forces the data down one row to the position A2:B201. 

At cell A1 type “Height”, at B1 type “BllyBtn” and at C1 type “Ratio”.
At cell C2 type the formula: =A2/B2 and then press [EXE].
From the Edit Menu select Fill Range and edit the range to C2:C201. 


When you tap OK, the column fills with numbers that 
look excitingly close to 1.6. 


Now select column C by tapping the column heading 
and from the graph menu select Box Whisker and you 
probably get ... nothing! 

And where did that 150 spring from? 


Now you will have a different selection of data than I 


downloaded, but for this question, there are likely to 


be plenty of students who gave stupid answers. 
That is what happens when people don’t understand why a question is important. 
Imagine the stupid answers some people write on census forms. This happens all 


the time when we collect statistics so we have to learn to cope with it. 


Select all the data in the range A2:C201. 

From the Edit Menu choose Sort. Fill in the Sort 
Dialogue Box giving the Range as A2:C201 and the 
Key column as C. When you press OK, the stupid 
answers go to the top or bottom of the heap. 

Here we need to distinguish between those who made
reasonable guesses and those who gave stupid
answers. I figured that the items at B189 and B190
looked like guesses, and then there was a bigger gap
before the really stupid answers switched in.


Therefore I deleted all the rows after row 190. You will need to examine your 
data very carefully because you may need to delete quite different lines. 


You may even find some stupid answers at the top of column C. 
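
The sort-and-delete step can also be sketched in Python. Note the difference from the lesson: there the cut-off was chosen by eye after sorting, whereas this sketch uses fixed limits of 1.3 and 2.0 for the ratio. Those limits are an assumption made for the example, as are the function name clean_ratios and every data pair except (165, 99), which is the first record in the sample shown earlier.

```python
def clean_ratios(pairs, low=1.3, high=2.0):
    """Split (height, navel height) pairs into plausible and implausible ones.

    low and high are assumed limits for height / navel height; look at your
    own sorted data before settling on them.
    """
    kept, rejected = [], []
    for height, navel in pairs:
        if navel <= 0:
            # A zero or negative navel height cannot be divided into the height.
            rejected.append((height, navel, None))
            continue
        ratio = height / navel
        if low <= ratio <= high:
            kept.append((height, navel, ratio))
        else:
            rejected.append((height, navel, ratio))
    kept.sort(key=lambda item: item[2])  # same idea as sorting on column C
    return kept, rejected

# (165, 99) is from the lesson's sample; the other pairs are invented.
pairs = [(165, 99), (172, 105), (158, 96), (150, 150), (170, 20), (160, 0)]
kept, rejected = clean_ratios(pairs)
for height, navel, ratio in kept:
    print(height, navel, round(ratio, 3))
print(len(rejected), "rows set aside")
```

Keeping the rejected rows in a separate list, rather than throwing them away, lets you look at them again and argue about whether each one really was a stupid answer.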

Time for another try


Now select column C by tapping the column heading and from the Graph Menu 


select Box Whisker and this time you get a proper Box and Whisker Plot. 


From the Series Menu tick the Show Outliers box and you may see that there are 


still a number of cases where the measurements are unlikely to be accurate. 


In this space copy your own Box and Whisker Plot. 


Your estimate of: 


the Median is 


Q1 is


Q3 is 
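
If you want to check your estimates, the median and quartiles can be calculated with the Python sketch below. It uses the “median of each half” method for Q1 and Q3. This is an assumption: the ClassPad may use a slightly different quartile rule, so small differences from your plot are possible. The ratios listed are invented and the function name quartiles is chosen for illustration.

```python
import statistics

def quartiles(values):
    """Return (Q1, median, Q3) using the median-of-halves method."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]              # values below the middle
    upper = data[half + (n % 2):]    # values above the middle
    return (statistics.median(lower),
            statistics.median(data),
            statistics.median(upper))

# Invented height / navel-height ratios for eight students.
ratios = [1.55, 1.58, 1.60, 1.61, 1.62, 1.63, 1.65, 1.70]
q1, median, q3 = quartiles(ratios)
print(round(q1, 3), round(median, 3), round(q3, 3))
```

The box in a Box and Whisker Plot runs from Q1 to Q3, with a line at the median, so these three numbers are the ones you are estimating from your plot.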


From the Graph Menu select Histogram. 


Copy your Histogram into this space. 

Use this evidence to comment on Dr Zeising’s theory of human anatomy.


Checkpoint 


Checkpoints 
Hair Checkpoint 


Any null hypothesis that the Boys and Girls have
similar hair care requirements is defeated by a χ² test
on my data. However, less senior students will need
to argue in terms of remoteness of their results from
expectation. The answers to the other questions will
give some idea of students’ understanding of the
issues.
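
For teachers who want to show senior students where the test statistic comes from, here is a minimal Python sketch of the χ² calculation for a contingency table. The counts are invented, not the author's data, and the function name chi_squared is a label chosen here. No continuity correction is applied, which is an assumption; a calculator or statistics package may apply one.

```python
def chi_squared(actual):
    """Chi-squared statistic: sum of (actual - expected)^2 / expected."""
    row_totals = [sum(row) for row in actual]
    col_totals = [sum(col) for col in zip(*actual)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(actual):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    return statistic

# Invented counts: rows are Female, Male; columns are Yes, No.
actual = [[30, 70],
          [10, 90]]

statistic = chi_squared(actual)
CRITICAL_5_PERCENT_1_DF = 3.841  # standard table value for a 2 x 2 table
print(statistic, statistic > CRITICAL_5_PERCENT_1_DF)
```

A statistic larger than the critical value means the actual counts are further from the expected counts than chance alone would usually produce, so the “no difference” hypothesis is rejected at the 5% level.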


Height Checkpoint 
Check that students are able to draw the graphs using their ClassPad and that 


their estimates are reasonably accurate for their data. 


Keith Devlin’s comments may be useful to understand the issues here. 
Another claim is that if you measure the distance from the tip of your head to the floor and divide 


that by the distance from your belly button to the floor, you get φ. But this is nonsense. First of
all, you won't get exactly the number φ. You never can; φ is irrational, remember. But in the


case of measuring the human body, there is a lot of variation. True, the answers will always be 
fairly close to 1.6. But there's nothing special about 1.6. Why not say the answer is 1.603? 
Besides, there's no reason to divide the human body by the navel. If you spend a half an hour or 
so taking measurements of various parts of the body and tabulating the results, you will find any 


number of pairs of figures whose ratio is close to 1.6, or 1.5, or whatever you want. 


When I downloaded data from Year 11 and Year 12 girls I was amazed how 
accurate the Zeising hypothesis is. There were 25 girls with ‘amazingly divine 


proportions’. 


Then one of the outliers gave the answer away. When their teacher decided the
fastest way to find the height of their navel was to divide by φ, one of the girls
multiplied instead. And so her measurements show her navel floating
harmoniously above her head. I was also interested that there is a much higher
proportion of stupid answers to this question. Without explanation, some senior
girls clearly resent the question.


Can I conclude this page by recommending the educative and supportive work of 
the ABS to you? Check out their web page, get your own class to enter data and 


enjoy using their materials. 